# Low-precision Efficient Inference

Llama 4 Maverick 17B 128E Instruct 6bit

A 6-bit quantized conversion of Meta's Llama 4 Maverick 17B 128E Instruct model that supports multilingual instruction following.

Large Language Model, Transformers, Supports Multiple Languages

Llama 4 Maverick 17B 128E Instruct FP8

Llama 4 is a family of multimodal models developed by Meta. The models handle text and image interactions, use a Mixture of Experts (MoE) architecture, and are described as delivering industry-leading performance in text and image comprehension. This entry is the FP8 variant of Maverick 17B 128E Instruct.

Transformers, Supports Multiple Languages

Moondream2 Llamafile

moondream2 is a compact vision-language model designed to run efficiently on edge devices. This build is packaged in the llamafile format for convenient deployment.

Qwen 7B Chat GPTQ

A 7-billion-parameter large language model developed by Alibaba Cloud and based on the Transformer architecture. It supports Chinese, English, and code, and handles multi-turn dialogue.

Large Language Model, Transformers, Supports Multiple Languages


© 2025 AIbase